Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BAU: Check internal and external links #701

Merged
merged 9 commits into from
May 17, 2023

Conversation

mrwilson
Copy link
Contributor

This PR introduces checking for internal links (within the GDS Way's own content) and external links (content on other sites).

We don't want to have broken links in the GDS Way as this is a bad and confusing user experience. This was highlighted in issue #700.

This PR builds on existing work in alphagov/tech-docs-template#187 and would aim to be folded into the tech docs template eventually.

This PR is also a place for discussion around:

  • updated content and new external links prompted by the link checker
  • how to handle rate-limiting by GitHub when checking links to repos, of which there are many in the project

@mrwilson mrwilson marked this pull request as draft March 31, 2022 16:14
@mrwilson mrwilson force-pushed the bau-check-internal-and-external-links branch 4 times, most recently from 069bdc9 to 8ac3fdf Compare September 30, 2022 15:30
@mrwilson mrwilson force-pushed the bau-check-internal-and-external-links branch from 8ac3fdf to 22c59f1 Compare April 26, 2023 10:09
@mrwilson mrwilson marked this pull request as ready for review April 26, 2023 10:10
Sometimes we break links (full or partial) when we move pages about or rename headers, so now html-proofer will run as part of the build to make sure we've not broken any links.
- The Reliability Engineering section for CDIO no longer exists
- Remove call-to-action to update the manual as there is no corresponding section
- Re-add "Testing with RSpec" header for ruby to unbreak a link
- Update links after changes to source-code guidance structure
Sometimes pages disappear from the internet or move, and we want to keep our outbound links in good condition. We sleep between requests to GitHub to avoid being rate-limited, and ignore some specific repositories which will fail the link checker.
- The Oracle JDK licence page does not resolve outside of a browser
- The FeatureFlags.io article is gone, replace it with a page sourced from Martin Fowler's site
- Minifycode.com is gone, replace it with a link to Google's web dev perf advice
- Replace NIST publication about least privilege with similar page from NCSC
@mrwilson mrwilson force-pushed the bau-check-internal-and-external-links branch 3 times, most recently from 2e741b7 to 54fd93e Compare April 26, 2023 10:21
@mrwilson mrwilson force-pushed the bau-check-internal-and-external-links branch from 54fd93e to 0d436ed Compare April 26, 2023 10:27
Copy link
Contributor

@36degrees 36degrees left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up, @mrwilson! A few things I'm not 100% sure about – might just be because I'm missing context.

My Ruby is rusty at best, so you might want a second set of eyes on some of this, depending on your confidence in the changes.

Gemfile.lock Outdated Show resolved Hide resolved
@mrwilson mrwilson force-pushed the bau-check-internal-and-external-links branch 2 times, most recently from 4866aeb to e1d0e23 Compare April 27, 2023 11:33
@mrwilson mrwilson requested a review from 36degrees April 27, 2023 12:44
@mrwilson mrwilson force-pushed the bau-check-internal-and-external-links branch from e1d0e23 to 25b3d6e Compare April 27, 2023 13:41
@mrwilson mrwilson force-pushed the bau-check-internal-and-external-links branch from 25b3d6e to da9b579 Compare April 27, 2023 14:00
@domoscargin
Copy link

domoscargin commented May 2, 2023

Edit: Seems like this is probably more to ignoring Github links 'cuz of the rate-limiting but I'll keep this comment here in case it's relevant once we solve that

How much do we care about hashes if the target page at least loads?

For instance, we're not catching the fact that this link:

[stricter standards of browser support](https://github.com/alphagov/govuk-frontend#browser-support).
has updated to https://github.com/alphagov/govuk-frontend#browser-and-assistive-technology-support

Looks like we could enable check_external_hash (and check_internal_hash?) to catch these.

@mrwilson
Copy link
Contributor Author

mrwilson commented May 17, 2023

@domoscargin That's a very good shout.

According to the HTMLProofer docs, the default value for both of those flags is true, however I will add them to the configuration for the sake of explicitness and not relying on default behaviour that may change.

EDIT: Wow, it was definitely not on by default. Fix commit incoming.

EDIT 2: Internal hash checking is fine, but not adding external hash checking as that would mean our deployments could be broken by an external page changing its structure. "Does this page exist?" is a good enough heuristic for external links and will catch the majority of failures

Also remove duplication between a single GitHub URL and the blanket GitHub regex
@mrwilson mrwilson force-pushed the bau-check-internal-and-external-links branch from 58f587f to 9e2219e Compare May 17, 2023 09:07
Copy link
Contributor

@richardTowers richardTowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

Only thing that I think will be slightly annoying is the inevitable "I just want to fix this typo, but to get the build working I need to fix these unrelated broken links". But that's probably better than having broken links, and I don't think an alternative cadence (e.g. daily check with a slack message, doesn't fail the build) would work given the limited ownership of GDS way.

@mrwilson mrwilson merged commit e9e8b22 into main May 17, 2023
@mrwilson mrwilson deleted the bau-check-internal-and-external-links branch May 17, 2023 11:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants